Method for detecting transcription templates

ABSTRACT

Methods are provided for detecting sense and antisense transcripts and for determining the template strand of a genomic DNA. Exemplary methods include reverse transcribing transcripts without second strand synthesis. The resulting single stranded DNA is labeled and detected using nucleic acid probe arrays. In a particularly preferred embodiment, actinomycin D is used to inhibit the synthesis of second strand cDNA during reverse transcription.

BACKGROUND OF THE INVENTION

[0001] The present invention is in the field of genetic analysis for medical diagnosis, genetic variation research, or genetic engineering. More specifically, the present invention is in the field of nucleic acid analysis.

[0002] For many studies involving microarrays, labeled cDNA is often used as a target.

[0003] This cDNA can be synthesized using either oligo d(T) primers, which bind to the poly(A) tail of eukaryotic mRNA, or random primers, for which the actual binding sequences are not known. It is known that during in vitro reverse transcription of RNA, not only the first-strand cDNA but also the second-strand cDNA is synthesized, as reverse transcriptase can use either RNA or DNA as a template (see, e.g., Gubler, 1987. Second-strand cDNA synthesis: classical method. Methods Enzymol. 152:325-9; Gubler, 1987. Second-strand cDNA synthesis: mRNA fragments as primers. Methods Enzymol. 152:330-5; Kim et al., 1996. Human immunodeficiency virus reverse transcriptase. Functional mutants obtained by random mutagenesis coupled with genetic selection in Escherichia coli. J Biol Chem. 271(9):4872-8; Krug, M. S., and S. L. Berger, 1987. First-strand cDNA synthesis primed with oligo(dT). Methods Enzymol. 152:316-25). There may be many mechanisms by which this second-strand priming occurs. Two possible mechanisms have been studied: the second strand cDNA is synthesized either through re-priming of the first strand cDNA by random hexamers or through hairpin loop formation at the 5′ end of the first-strand cDNA.

[0004] High-density oligonucleotide arrays have been widely used for gene expression analysis. In addition, they are an ideal platform for other applications such as transcriptome analysis, antisense detection, splice variant detection, and genotyping. Some of these applications use random hexamer cDNA synthesis for target preparation. The synthesis of second strand cDNA complicates data analysis because of the additional strand (e.g., antisense RNA cannot be identified). Therefore, there is a need in the art for methods that can uniquely identify the sense strand. In addition, methods for identifying the template strand of a genomic DNA are needed.

SUMMARY OF THE INVENTION

[0005] In one aspect of the invention, methods are provided for detecting a plurality of transcripts without the interference of second strand DNA. The methods include synthesizing a plurality of cDNAs complementary with the transcripts by reverse transcription, where the synthesis of second strand cDNA is inhibited; and hybridizing the cDNAs or nucleic acids derived from the cDNAs with a nucleic acid probe array to detect and identify the transcripts. The methods are particularly suitable for detecting a large number of transcripts, such as at least 100, 1000, or 10000 transcripts. Any suitable second strand cDNA synthesis inhibition method may be used with at least some embodiments of the invention. In a particularly preferred embodiment, hairpin loop formation inhibition is used to inhibit second strand cDNA synthesis. In one particularly preferred embodiment, the synthesis of the second strand cDNA is inhibited by the presence of actinomycin D, DMSO or sodium pyrophosphate. The cDNAs or nucleic acids derived from the cDNAs (e.g., products of PCR amplification of the cDNAs, etc.) may be labeled with any suitable labels, such as radioactive labels, fluorescent labels, and chemiluminescent labels.

[0006] The nucleic acid array can be a high density oligonucleotide probe array with at least 400, 1000, or 10000 probes per cm². In preferred embodiments, the array contains at least one probe against a target sequence and one probe against the reverse complementary sequence of the target sequence. In more preferred embodiments, the array contains at least 100 probes against at least 100 target sequences and at least 100 probes against at least 100 reverse complementary sequences of the target sequences. In even more preferred embodiments, the array comprises at least 1000 or 3000 probes against at least 1000 or 3000 target sequences and at least 1000 or 3000 probes against at least 1000 or 3000 reverse complementary sequences of the target sequences.

[0007] In another aspect of the invention, methods are provided for detecting the transcribed regions of a genome and the template strand of the genomic DNA. The methods are particularly suitable for analyzing regions where both strands of the genomic DNA may be transcribed. In preferred embodiments, the methods include obtaining a sample containing transcripts transcribed from the genome; synthesizing single stranded cDNAs complementary with the transcripts, where the synthesis of second strand cDNA is inhibited; and hybridizing the cDNAs or nucleic acids derived from the cDNAs with a nucleic acid probe array, where the nucleic acid probe array has probes targeting both strands of the genomic DNA in regions of interest. Any suitable second strand cDNA synthesis inhibition method may be used with at least some embodiments of the invention. In a particularly preferred embodiment, hairpin loop formation inhibition is used to inhibit second strand cDNA synthesis. In one particularly preferred embodiment, the synthesis of the second strand cDNA is inhibited by the presence of actinomycin D. The cDNAs or nucleic acids derived from the cDNAs (e.g., products of PCR amplification of the cDNAs, etc.) may be labeled with any suitable labels, such as radioactive labels, fluorescent labels, and chemiluminescent labels. The nucleic acid array can be a high density oligonucleotide probe array with at least 400, 1000, or 10000 probes per cm². In preferred embodiments, the array contains at least one probe against a target sequence and one probe against the reverse complementary sequence of the target sequence. In more preferred embodiments, the array contains at least 100 probes against at least 100 target sequences and at least 100 probes against at least 100 reverse complementary sequences of the target sequences. In even more preferred embodiments, the array comprises at least 1000 or 3000 probes against at least 1000 or 3000 target sequences and at least 1000 or 3000 probes against at least 1000 or 3000 reverse complementary sequences of the target sequences.

[0008] In yet another aspect of the invention, an assay kit is provided. The kit contains reagents necessary for a reverse transcription reaction; an inhibitor of second strand cDNA synthesis; and a nucleic acid probe array. In preferred embodiments, the inhibitor is actinomycin D. The nucleic acid probe array is an oligonucleotide probe array that has at least 400, 1000, or 10000 probes per cm².

BRIEF DESCRIPTION OF DRAWINGS

[0009] The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention:

[0010] FIG. 1 is a schematic showing the role of the hairpin loop in cDNA synthesis.

[0011] FIG. 2 is a schematic showing a probe array containing probes against potential transcripts from both strands of the genomic DNA.

DETAILED DESCRIPTION

[0012] Reference will now be made in detail to the preferred embodiments of the invention. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention.

[0013] General

[0014] The present invention relies on many patents, applications and other references for certain details well known to those of skill in the art. Therefore, when a patent, application, or other reference is cited or repeated below, it should be understood that it is incorporated by reference in its entirety for all purposes as well as for the proposition that is recited.

[0015] As used in the specification and claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “an agent” includes a plurality of agents, including mixtures thereof.

[0016] An individual is not limited to a human being but may also be other organisms including but not limited to mammals, plants, bacteria, or cells derived from any of the above.

[0017] Throughout this disclosure, various aspects of this invention are presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.

[0018] The practice of the present invention may employ, unless otherwise indicated, conventional techniques of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Such conventional techniques can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press), all of which are herein incorporated in their entirety by reference for all purposes.

[0019] Additional methods and techniques applicable to array synthesis have been described in U.S. Pat. Nos. 5,143,854, 5,242,974, 5,252,743, 5,324,633, 5,384,261, 5,405,783, 5,412,087, 5,424,186, 5,445,934, 5,451,683, 5,482,867, 5,489,678, 5,491,074, 5,510,270, 5,527,681, 5,550,215, 5,571,639, 5,578,832, 5,593,839, 5,599,695, 5,624,711, 5,631,734, 5,677,195, 5,744,101, 5,744,305, 5,770,456, 5,795,716, 5,800,992, 5,831,070, 5,837,832, 5,856,101, 5,871,928, 5,858,659, 5,936,324, 5,968,740, 5,974,164, 5,981,185, 5,981,956, 6,025,601, 6,033,860, 6,040,138, and 6,090,555, which are all incorporated herein by reference in their entirety for all purposes.

[0020] Analogue, when used in conjunction with a biomonomer or a biopolymer, refers to natural and un-natural variants of the particular biomonomer or biopolymer. For example, a nucleotide analogue includes inosine and dideoxynucleotides. A nucleic acid analogue includes peptide nucleic acids. The foregoing is not intended to be exhaustive but rather representative. More information can be found in U.S. patent application Ser. No. 80/630,427.

[0021] Complementary or substantially complementary: Refers to the hybridization or base pairing between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid to be sequenced or amplified. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%. Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementarity. See, e.g., M. Kanehisa, Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
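By way of illustration only, the percentage test described above can be sketched in Python for the simplest case of two equal-length strands aligned without gaps. The function names and the ungapped alignment are assumptions made for this sketch; the definition above also permits insertions and deletions, which the sketch does not model.

```python
# Watson-Crick partners; U is treated as pairing with A, as stated above.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "U": "A"}


def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(strand_a, strand_b):
    """Percent of positions that base pair when the strands are antiparallel.

    Hypothetical helper: assumes equal length and no gaps.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only handles equal-length strands")
    # After reverse-complementing strand_b, paired positions become identities.
    partner = reverse_complement(strand_b)
    a = strand_a.upper().replace("U", "T")
    matches = sum(x == y for x, y in zip(a, partner))
    return 100.0 * matches / len(a)


def substantially_complementary(strand_a, strand_b, threshold=80.0):
    """Apply the 'at least about 80%' criterion with an adjustable threshold."""
    return percent_complementarity(strand_a, strand_b) >= threshold
```

A stricter threshold, such as 90 or 98, can be passed to reflect the more preferred ranges recited above.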

[0022] Hybridization refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide; triple-stranded hybridization is also theoretically possible. The resulting (usually) double-stranded polynucleotide is a hybrid. The proportion of the population of polynucleotides that forms stable hybrids is referred to herein as the degree of hybridization. Hybridizations are usually performed under stringent conditions, for example, at a salt concentration of no more than 1 M and a temperature of at least 25° C. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations. For stringent conditions, see, for example, Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Press (1989), which is hereby incorporated by reference in its entirety for all purposes above.

[0023] Nucleic acid refers to a polymeric form of nucleotides of any length, such as oligonucleotides or polynucleotides, either ribonucleotides, deoxyribonucleotides or peptide nucleic acids (PNAs), that comprise purine and pyrimidine bases, or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases. The backbone of the polynucleotide can comprise sugars and phosphate groups, as may typically be found in RNA or DNA, or modified or substituted sugar or phosphate groups. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. The sequence of nucleotides may be interrupted by non-nucleotide components. Thus the terms nucleoside, nucleotide, deoxynucleoside and deoxynucleotide generally include analogs such as those described herein. These analogs are those molecules having some structural features in common with a naturally occurring nucleoside or nucleotide such that when incorporated into a nucleic acid or oligonucleotide sequence, they allow hybridization with a naturally occurring nucleic acid sequence in solution. Typically, these analogs are derived from naturally occurring nucleosides and nucleotides by replacing and/or modifying the base, the ribose or the phosphodiester moiety. The changes can be customized to stabilize or destabilize hybrid formation or enhance the specificity of hybridization with a complementary nucleic acid sequence as desired.

[0024] Oligonucleotide or polynucleotide is a nucleic acid ranging from at least 2, preferably at least 8, and more preferably at least 20 nucleotides in length or a compound that specifically hybridizes to a polynucleotide. Polynucleotides of the present invention include sequences of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) or mimetics thereof which may be isolated from natural sources, recombinantly produced or artificially synthesized. A further example of a polynucleotide of the present invention may be a peptide nucleic acid (PNA). The invention also encompasses situations in which there is a nontraditional base pairing such as Hoogsteen base pairing, which has been identified in certain tRNA molecules and postulated to exist in a triple helix. Polynucleotide and oligonucleotide are used interchangeably in this application.

[0025] Polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. A polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at a frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population. A polymorphism may comprise one or more base changes, an insertion, a repeat, or a deletion. A polymorphic locus may be as small as one base pair. Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. The first identified allelic form is arbitrarily designated as the reference form and other allelic forms are designated as alternative or variant alleles. The allelic form occurring most frequently in a selected population is sometimes referred to as the wildtype form. Diploid organisms may be homozygous or heterozygous for allelic forms. A diallelic polymorphism has two forms. A triallelic polymorphism has three forms.

[0026] Primer is a single-stranded oligonucleotide capable of acting as a point of initiation for template-directed DNA synthesis under suitable conditions, e.g., buffer and temperature, in the presence of four different nucleoside triphosphates and an agent for polymerization, such as, for example, DNA or RNA polymerase or reverse transcriptase. The length of the primer, in any given case, depends on, for example, the intended use of the primer, and generally ranges from 3 to 6 and up to 30 or 50 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with such template. The primer site is the area of the template to which a primer hybridizes. The primer pair is a set of primers including a 5′ upstream primer that hybridizes with the 5′ end of the sequence to be amplified and a 3′ downstream primer that hybridizes with the complement of the 3′ end of the sequence to be amplified.

[0027] Substrate refers to a material or group of materials having a rigid or semi-rigid surface or surfaces. In many embodiments, at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like. According to other embodiments, the solid support(s) will take the form of beads, resins, gels, microspheres, or other geometric configurations.

[0028] High density nucleic acid probe arrays, also referred to as DNA Microarrays, have become a method of choice for monitoring the expression of a large number of genes.

[0029] A target molecule refers to a biological molecule of interest. The biological molecule of interest can be a ligand, receptor, peptide, nucleic acid (oligonucleotide or polynucleotide of RNA or DNA), or any other of the biological molecules listed in U.S. Pat. No. 5,445,934 at col. 5, line 66 to col. 7, line 51. For example, if transcripts of genes are the interest of an experiment, the target molecules would be the transcripts. Other examples include protein fragments, small molecules, etc. Target nucleic acid refers to a nucleic acid (often derived from a biological sample) of interest. Frequently, a target molecule is detected using one or more probes. As used herein, a probe is a molecule for detecting a target molecule. It can be any of the molecules in the same classes as the target referred to above. A probe may refer to a nucleic acid, such as an oligonucleotide, capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, U, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in probes may be joined by a linkage other than a phosphodiester bond, so long as the bond does not interfere with hybridization. Thus, probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages. Other examples of probes include antibodies used to detect peptides or other molecules, and any ligands used to detect their binding partners. When referring to targets or probes as nucleic acids, it should be understood that these are illustrative embodiments that are not to limit the invention in any way.

[0030] In preferred embodiments, probes may be immobilized on substrates to create an array. An array may comprise a solid support with peptide or nucleic acid or other molecular probes attached to the support. Arrays typically comprise a plurality of different nucleic acids or peptide probes that are coupled to a surface of a substrate in different, known locations. These arrays, also described as “microarrays” or colloquially “chips,” have been generally described in the art, for example, in Fodor et al., Science, 251:767-777 (1991), which is incorporated by reference for all purposes. Methods of forming high density arrays of oligonucleotides, peptides and other polymer sequences with a minimal number of synthetic steps are disclosed in, for example, U.S. Pat. Nos. 5,143,854, 5,252,743, 5,384,261, 5,405,783, 5,424,186, 5,429,807, 5,445,943, 5,510,270, 5,677,195, 5,571,639, 6,040,138, all incorporated herein by reference for all purposes. The oligonucleotide analogue array can be synthesized on a solid substrate by a variety of methods, including, but not limited to, light-directed chemical coupling and mechanically directed coupling. See Pirrung et al., U.S. Pat. No. 5,143,854 (see also PCT Application No. WO 90/15070) and Fodor et al., PCT Publication Nos. WO 92/10092 and WO 93/09668, U.S. Pat. Nos. 5,677,195, 5,800,992 and 6,156,501, which disclose methods of forming vast arrays of peptides, oligonucleotides and other molecules using, for example, light-directed synthesis techniques. See also, Fodor et al., Science, 251, 767-77 (1991). These procedures for synthesis of polymer arrays are now referred to as VLSIPS™ procedures. Using the VLSIPS™ approach, one heterogeneous array of polymers is converted, through simultaneous coupling at a number of reaction sites, into a different heterogeneous array. See, U.S. Pat. Nos. 5,384,261 and 5,677,195.

[0031] Methods for making and using molecular probe arrays, particularly nucleic acid probe arrays, are also disclosed in, for example, U.S. Pat. Nos. 5,143,854, 5,242,974, 5,252,743, 5,324,633, 5,384,261, 5,405,783, 5,409,810, 5,412,087, 5,424,186, 5,429,807, 5,445,934, 5,451,683, 5,482,867, 5,489,678, 5,491,074, 5,510,270, 5,527,681, 5,541,061, 5,550,215, 5,554,501, 5,556,752, 5,556,961, 5,571,639, 5,583,211, 5,593,839, 5,599,695, 5,607,832, 5,624,711, 5,677,195, 5,744,101, 5,744,305, 5,753,788, 5,770,456, 5,770,722, 5,831,070, 5,856,101, 5,885,837, 5,889,165, 5,919,523, 5,922,591, 5,925,517, 5,658,734, 6,022,963, 6,150,147, 6,147,205, 6,153,743, 6,140,044 and D430024, all of which are incorporated by reference in their entireties for all purposes.

[0032] Methods for signal detection and processing of intensity data are additionally disclosed in, for example, U.S. Pat. Nos. 5,547,839, 5,578,832, 5,631,734, 5,800,992, 5,856,092, 5,936,324, 5,981,956, 6,025,601, 6,090,555, 6,141,096, and 5,902,723. Methods for array based assays, computer software for data analysis and applications are additionally disclosed in, e.g., U.S. Pat. Nos. 5,527,670, 5,527,676, 5,545,531, 5,622,829, 5,631,128, 5,639,423, 5,646,039, 5,650,268, 5,654,155, 5,674,742, 5,710,000, 5,733,729, 5,795,716, 5,814,450, 5,821,328, 5,824,477, 5,834,252, 5,834,758, 5,837,832, 5,843,655, 5,856,086, 5,856,104, 5,856,174, 5,858,659, 5,861,242, 5,869,244, 5,871,928, 5,874,219, 5,902,723, 5,925,525, 5,928,905, 5,935,793, 5,945,334, 5,959,098, 5,968,730, 5,968,740, 5,974,164, 5,981,174, 5,981,185, 5,985,651, 6,013,440, 6,013,449, 6,020,135, 6,027,880, 6,027,894, 6,033,850, 6,033,860, 6,037,124, 6,040,138, 6,040,193, 6,043,080, 6,045,996, 6,050,719, 6,066,454, 6,083,697, 6,114,116, 6,114,122, 6,121,048, 6,124,102, 6,130,046, 6,132,580, 6,132,996, 6,136,269 and attorney docket numbers 3298.1 and 3309, all of which are incorporated by reference in their entireties for all purposes.

[0033] The embodiments of the invention will be described using GeneChip® high density oligonucleotide probe arrays (available from Affymetrix, Inc., Santa Clara, Calif., USA) as exemplary embodiments. One of skill in the art would appreciate that the embodiments of the invention are not limited to high density oligonucleotide probe arrays. On the contrary, the embodiments of the invention are useful for any parallel large scale biological analysis, such as those using nucleic acid probe arrays, protein arrays, etc.

[0034] Gene expression monitoring using GeneChip® high density oligonucleotide probe arrays is described in, for example, Lockhart et al., 1996, Expression Monitoring By Hybridization to High Density Oligonucleotide Arrays, Nature Biotechnology 14:1675-1680; U.S. Pat. Nos. 6,040,138 and 5,800,992, all incorporated herein by reference in their entireties for all purposes.

[0035] Detection of Sense and Antisense Transcripts

[0036] Transcription entails the synthesis of a single-stranded polynucleotide of RNA at an unwound section of DNA with one of the DNA strands serving as a template for the synthesis of the RNA. The product of this process is called an RNA transcript. RNAs can be transcribed from either strand or both strands of the genomic DNA. In some instances, both strands of the same genomic DNA region may be transcribed. The term “template strand,” as used herein, refers to the genomic DNA strand used as a template for an RNA transcript. The reverse complementary strand of the template strand is referred to as the reverse strand. Because both strands can be used as templates, the terms “template strand” and “reverse strand,” as used herein, are often relative to particular transcripts.

[0037] As used herein, the term “sense strand” refers to the genomic DNA strand which is identical in sequence to the RNA transcribed (with T in place of U). The actual template (template strand) for the transcription is the reverse strand of the sense strand. An antisense strand is the template strand for the transcript.
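The relationship among the template strand, the sense strand and the transcript can be illustrated with a short Python sketch. The function names and the example sequence are hypothetical and chosen only for illustration; all sequences are written 5′ to 3′.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def transcribe(template_strand):
    """RNA (5' to 3') copied from a template DNA strand (5' to 3').

    The RNA is complementary and antiparallel to the template.
    """
    pairs = {"A": "U", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(template_strand.upper()))


def sense_strand_for(template_strand):
    """The sense strand is the reverse strand of the template strand."""
    return reverse_complement(template_strand)


# Hypothetical example: a short template strand and what follows from it.
template = "ATTGGCCAT"
transcript = transcribe(template)      # RNA read from the template
sense = sense_strand_for(template)     # DNA strand matching the RNA
```

In this sketch the transcript and the sense strand differ only by U in place of T, which is the identity described in the definition above.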

[0038] It is well known that both the sense and antisense transcripts of certain genes may encode proteins or regulate gene activities. One example of sense and antisense transcription is the gene for neurofibromin, a tumor suppressor protein that is absent or inactivated in neurofibromatosis type 1 (NF1), an inherited illness that causes ‘cafe-au-lait’ spots on the skin and tumors beneath the skin. Within an intron of the neurofibromin gene, but encoded on the antisense strand of the DNA, are codons for three other proteins: oligodendrocyte-myelin glycoprotein, which may control cell proliferation, and two homologs of a mouse gene that causes myeloid leukemia.

[0039] Nucleic acid probe arrays have been used to monitor a large number of transcripts simultaneously and are also being used to interrogate the genome for potential transcripts. In many instances, probes against both the sense and antisense transcripts or potential transcripts are used simultaneously. Some of these applications use random hexamer or nonamer primers, or specific primers, for cDNA synthesis during target preparation. As FIG. 1 shows, in addition to first strand cDNA synthesis, a second strand cDNA may be synthesized as well, using the hairpin loop as the primer. The second strand cDNA synthesis could complicate the data analysis because of the additional strand, particularly if a probe array contains probes against both the sense and antisense transcripts (see FIG. 2). For example, in a case where the sense strand transcript, but not the antisense transcript, is present in a sample, a probe against the antisense transcript may detect the second strand synthesized, so that both the sense and antisense probes show signals. Similarly, if the transcript present in the sample is an antisense transcript, the probes targeting both the sense and antisense transcripts may show signals, which could complicate data analysis.
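The ambiguity described above can be reproduced with a toy Python model. This is a simplified sketch, not a description of actual hybridization chemistry: each probe is assumed to be full length and to give a signal only when a cDNA in the sample is its exact reverse complement, and all names and sequences are hypothetical.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def probes_with_signal(transcripts, probes, second_strand_made):
    """Names of probes that would light up in this toy model.

    transcripts: transcript sequences written in the DNA alphabet (assumption).
    probes: mapping of probe name to probe sequence.
    second_strand_made: True if second strand cDNA synthesis is not inhibited.
    """
    cdnas = set()
    for transcript in transcripts:
        first_strand = reverse_complement(transcript)
        cdnas.add(first_strand)
        if second_strand_made:
            # The second strand is copied from the first strand cDNA.
            cdnas.add(reverse_complement(first_strand))
    return {name for name, seq in probes.items()
            if reverse_complement(seq) in cdnas}


# Hypothetical sample containing only the sense transcript.
sense_transcript = "ATGGCCAAT"
probes = {
    "sense_probe": sense_transcript,                          # binds cDNA of the sense transcript
    "antisense_probe": reverse_complement(sense_transcript),  # binds cDNA of the antisense transcript
}
inhibited = probes_with_signal([sense_transcript], probes, second_strand_made=False)
uninhibited = probes_with_signal([sense_transcript], probes, second_strand_made=True)
```

In this model only the sense probe gives a signal when second strand synthesis is inhibited, whereas both probes give signals when it is not, even though no antisense transcript is present.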

[0040] The inventors have experimentally shown that second strand cDNA synthesis is mostly triggered by hairpin loop formation at the 5′ end of the first-strand cDNA and not through repriming of the cDNA with random hexamer primers. In one aspect of the invention, methods are provided for inhibiting second-strand cDNA synthesis and improving the detection of sense and antisense transcripts, particularly when probes targeting sense and antisense transcripts are used simultaneously. The methods are particularly useful for interrogating the genome for potential transcripts. In such cases, because both strands of the genomic DNA can be used as templates, probes against potential transcripts from both strands are often used to determine potentially transcribed regions. In some embodiments of the invention, methods are provided to determine the template strand of the potential transcripts. The methods include preparing cDNAs from a transcript sample while hairpin formation or second strand cDNA synthesis is inhibited. The cDNAs or nucleic acids derived from the cDNAs are hybridized to a nucleic acid probe array. The array may contain probes against both strands of the genomic DNA. The hybridization data are used to analyze not only which region of the genome is transcribed, but also which strand of the genomic DNA is used as a template for a detected transcript.
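One possible way to turn such hybridization data into a template strand call is sketched below in Python. The sketch rests on stated assumptions that are not taken from this disclosure: the two genomic strands are labeled “plus” and “minus,” each probe is named for the strand whose sequence it carries, only first strand cDNA is present, and a single fixed intensity threshold separates signal from background. Under these assumptions a probe carrying the plus strand sequence binds first strand cDNA from a transcript that reads like the plus strand, so the template for that transcript is the minus strand.

```python
def call_template_strand(plus_probe_signal, minus_probe_signal, threshold=100.0):
    """Call the template strand for one genomic region.

    Hypothetical rule: a probe signal at or above `threshold` counts as detected.
    Returns "minus", "plus", "both" or "none".
    """
    plus_seen = plus_probe_signal >= threshold
    minus_seen = minus_probe_signal >= threshold
    if plus_seen and minus_seen:
        return "both"   # transcripts from both strands detected
    if plus_seen:
        return "minus"  # transcript matches plus strand, so minus is the template
    if minus_seen:
        return "plus"   # transcript matches minus strand, so plus is the template
    return "none"       # region not detectably transcribed


def call_regions(signals, threshold=100.0):
    """Apply the call to a mapping of region name to (plus, minus) signals."""
    return {region: call_template_strand(plus, minus, threshold)
            for region, (plus, minus) in signals.items()}


# Hypothetical intensities for four regions.
example = {"r1": (850.0, 20.0), "r2": (15.0, 640.0),
           "r3": (500.0, 450.0), "r4": (10.0, 12.0)}
calls = call_regions(example)
```

The call of “both” is interpretable as bidirectional transcription only when second strand synthesis has been inhibited; otherwise it may simply reflect second strand cDNA.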

[0041] Methods are also provided for detecting the expression of genes that have both sense and antisense transcripts. In such methods, probes against both the sense and antisense transcripts are used simultaneously. The signals from the sense and antisense probes are used to determine the relative level of the sense and antisense transcripts. If the second strand cDNA synthesis is not inhibited, both the sense and antisense probes may detect either sense or antisense transcripts, which makes data interpretation much more complicated.

[0042] The methods have applications in areas such as drug discovery and diagnostics. For example, new transcripts detected may serve as potential drug targets.
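As a minimal sketch of how a relative level might be computed from the two probe signals, the Python function below reports a log2 ratio and the sense fraction. The pseudocount, the function name and the choice of these two summary values are assumptions for illustration; no particular normalization is prescribed here.

```python
import math


def sense_antisense_summary(sense_signal, antisense_signal, pseudocount=1.0):
    """Summarize the relative level of sense and antisense signals.

    A small pseudocount (hypothetical default of 1.0) avoids division by zero
    when one of the signals is absent.
    """
    s = sense_signal + pseudocount
    a = antisense_signal + pseudocount
    if s <= 0 or a <= 0:
        raise ValueError("signals plus pseudocount must be positive")
    return {
        "log2_ratio": math.log2(s / a),   # > 0 means sense predominates
        "sense_fraction": s / (s + a),    # share of total signal from sense
    }
```

For example, a sense signal of 400 and an antisense signal of 100 with no pseudocount correspond to a four-fold excess of the sense transcript.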

[0043] One of skill in the art would appreciate that any means for inhibiting hairpin loop formation or second strand cDNA synthesis can be used for some embodiments of the invention. In a particularly preferred embodiment, the anti-tumor antibiotic actinomycin D (AMD) is used to inhibit hairpin formation; experiments have shown that actinomycin D reduced the number of second strand cDNAs by more than 64%. In some other embodiments, the addition of sodium pyrophosphate to the first strand cDNA synthesis is used to suppress hairpin formation. In additional embodiments, DMSO of appropriate concentration (such as 15% DMSO) can be used to suppress second strand synthesis with no apparent decrease in first strand synthesis (Gross, L. et al. (1992) J. Mol. Biol. 228, 488, incorporated herein by reference).

[0044] In one aspect of the invention, methods are provided for detecting a plurality of transcripts without the interference of second strand DNA. The methods include synthesizing a plurality of cDNAs complementary with the transcripts by reverse transcription, where the synthesis of second strand cDNA is inhibited; and hybridizing the cDNAs or nucleic acids derived from the cDNAs with a nucleic acid probe array to detect the transcripts. The methods are particularly suitable for detecting a large number of transcripts, such as at least 100, 1000, or 10000 transcripts. Any suitable second strand cDNA synthesis inhibition method may be used with at least some embodiments of the invention. In a particularly preferred embodiment, hairpin loop formation inhibition is used to inhibit second strand cDNA synthesis. In one particularly preferred embodiment, the synthesis of the second strand cDNA is inhibited by the presence of actinomycin D, DMSO or sodium pyrophosphate. The cDNAs or nucleic acids derived from the cDNAs (e.g., products of PCR amplification of the cDNAs, etc.) may be labeled with any suitable labels, such as radioactive labels, fluorescent labels, and chemiluminescent labels.

[0045] The nucleic acid array can be a high density oligonucleotide probe array with at least 400, 1000, or 10000 probes per cm². In preferred embodiments, the array contains at least one probe against a target sequence and one probe against the reverse complementary sequence of the target sequence. In more preferred embodiments, the array contains at least 100 probes against at least 100 target sequences and at least 100 probes against at least 100 reverse complementary sequences of the target sequences. In even more preferred embodiments, the array comprises at least 1000 or 3000 probes against at least 1000 or 3000 target sequences and at least 1000 or 3000 probes against at least 1000 or 3000 reverse complementary sequences of the target sequences.
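As a minimal sketch of the probe pairing described above, the following Python example derives the probe against the reverse complementary sequence from a probe against a target sequence. The function names and the example sequence are hypothetical and chosen only for illustration; they are not taken from any array design software.

```python
# Illustrative sketch: pair a probe against a target sequence with a probe
# against its reverse complement. All names and the sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence made of A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def strand_probe_pair(target: str) -> dict:
    """Return the probe for one strand together with the probe for the other strand."""
    return {
        "target_probe": target,
        "reverse_complement_probe": reverse_complement(target),
    }


pair = strand_probe_pair("ATGGCGTACGTTAGC")
print(pair["target_probe"], pair["reverse_complement_probe"])
```

Applying the function twice returns the original sequence, which is a convenient sanity check when building probe lists for both strands.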

[0046] In another aspect of the invention, methods are provided for detecting the transcribed regions of a genome. The methods are particularly suitable for analyzing regions where both strands of the genomic DNA are transcribed. In preferred embodiments, the methods include obtaining a sample containing transcripts transcribed from the genome; synthesizing single stranded cDNAs complementary with the transcripts, where the synthesis of second strand cDNA is inhibited; and hybridizing the cDNAs or nucleic acids derived from the cDNAs with a nucleic acid probe array, where the nucleic acid probe array has probes targeting both strands of the genomic DNA in the regions of interest.

[0047] Any suitable second strand cDNA synthesis inhibition method may be used with at least some embodiments of the invention. In a particularly preferred embodiment, hairpin loop formation inhibition is used to inhibit second strand cDNA synthesis. In one particularly preferred embodiment, the synthesis of the second strand cDNA is inhibited by the presence of actinomycin D. The cDNAs or nucleic acids derived from the cDNAs (e.g., products of PCR amplification of the cDNAs, etc.) may be labeled with any suitable labels, such as radioactive labels, fluorescent labels, and chemiluminescent labels, etc.

[0048] The nucleic acid array can be a high density oligonucleotide probe array with at least 400, 1000, or 10000 probes per cm². In preferred embodiments, the array contains at least one probe against a target sequence and one probe against the reverse complementary sequence of the target sequence. In more preferred embodiments, the array contains at least 100 probes against at least 100 target sequences and at least 100 probes against at least 100 reverse complementary sequences of the target sequences. In even more preferred embodiments, the array comprises at least 1000 or 3000 probes against at least 1000 or 3000 target sequences and at least 1000 or 3000 probes against at least 1000 or 3000 reverse complementary sequences of the target sequences.

[0049] In yet another aspect of the invention, an assay kit is provided. The kit contains reagents necessary for a reverse transcription reaction; an inhibitor of second strand cDNA synthesis; and a nucleic acid probe array. In preferred embodiments, the inhibitor is actinomycin D. The nucleic acid probe array is an oligonucleotide probe array that has at least 400, 1000, or 10000 probes per cm².

[0050] Sample Preparation and Hybridization

[0051] The methods of the invention are not limited to any particular method of sample preparation. A large number of well-known methods for isolating and purifying RNA are suitable for this invention.

[0052] One of skill in the art will appreciate that it is desirable to have nucleic acid samples containing target nucleic acid sequences that reflect the transcripts of interest. Therefore, suitable nucleic acid samples may contain transcripts of interest. Suitable nucleic acid samples, however, may also contain nucleic acids derived from the transcripts of interest. As used herein, a nucleic acid derived from a transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from a transcript, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the transcript, and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like. Transcripts, as used herein, may include, but are not limited to, pre-mRNA nascent transcript(s), transcript processing intermediates, mature mRNA(s) and degradation products. It is not necessary to monitor all types of transcripts to practice this invention. For example, one may choose to practice the invention to measure the mature mRNA levels only.

[0053] In one embodiment, such a sample is a homogenate of cells or tissues or other biological samples. Preferably, such a sample is a total RNA preparation of a biological sample. More preferably in some embodiments, such a nucleic acid sample is the total mRNA isolated from a biological sample. Those of skill in the art will appreciate that the total mRNA prepared with most methods includes not only the mature mRNA, but also the RNA processing intermediates and nascent pre-mRNA transcripts. For example, total mRNA purified with a poly (T) column contains RNA molecules with poly (A) tails. Those poly A+ RNA molecules could be mature mRNA, RNA processing intermediates, nascent transcripts or degradation intermediates.

[0054] Biological samples may be of any biological tissue or fluid or cells. Frequently the sample will be a “clinical sample” which is a sample derived from a patient. Clinical samples provide a rich source of information regarding the various states of genetic network or gene expression. Some embodiments of the invention are employed to detect mutations and to identify the function of mutations. Such embodiments have extensive applications in clinical diagnostics and clinical studies. Typical clinical samples include, but are not limited to, sputum, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues such as frozen sections taken for histological purposes.

[0055] Another typical source of biological samples is cell cultures, where gene expression states can be manipulated to explore the relationship among genes. In one aspect of the invention, methods are provided to generate biological samples reflecting a wide variety of states of the genetic network.

[0056] One of skill in the art would appreciate that it is desirable to inhibit or destroy RNase present in homogenates before homogenates can be used for hybridization. Methods of inhibiting or destroying nucleases are well known in the art. In some preferred embodiments, cells or tissues are homogenized in the presence of chaotropic agents to inhibit nuclease. In some other embodiments, RNases are inhibited or destroyed by heat treatment followed by proteinase treatment.

[0057] Methods of isolating total RNA and mRNA are also well known to those of skill in the art. For example, methods of isolation and purification of nucleic acids are described in detail in Chapter 3 of Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, P. Tijssen, ed. Elsevier, N.Y. (1993).

[0058] In a preferred embodiment, the total RNA is isolated from a given sample using, for example, an acid guanidinium-phenol-chloroform extraction method and polyA+ mRNA is isolated by oligo (dT) column chromatography or by using (dT) magnetic beads (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989), or Current Protocols in Molecular Biology, F. Ausubel et al., ed. Greene Publishing and Wiley-Interscience, New York (1987)).

[0059] Most eukaryotic mRNAs have 3′ poly (A) tails, whereas some eukaryotic and all prokaryotic mRNAs do not contain 3′ poly (A) tails. It is often desirable to isolate mRNAs from RNA samples.

[0060] In one particularly preferred embodiment, total RNA is isolated from mammalian cells using the RNeasy Total RNA isolation kit (QIAGEN). If mammalian tissue is used as the source of RNA, a commercial reagent such as TRIzol Reagent (GIBCOL Life Technologies) may be used. A second cleanup after the ethanol precipitation step in the TRIzol extraction, using the RNeasy total RNA isolation kit, may be beneficial.

[0061] The hot phenol protocol described by Schmitt, et al., (1990) Nucleic Acids Res., 18:3091-3092 is useful for isolating total RNA from yeast cells.

[0062] Good quality mRNA may be obtained by, for example, first isolating total RNA and then isolating the mRNA from the total RNA using the Oligotex mRNA kit (QIAGEN).

[0063] Total RNA from prokaryotes, such as E. coli cells, may be obtained by following the protocol for the MasterPure complete DNA/RNA purification kit from Epicentre Technologies (Madison, Wis.).

[0064] Frequently, it is desirable to amplify the nucleic acid sample prior to hybridization. One of skill in the art will appreciate that whatever amplification method is used, if a quantitative result is desired, care must be taken to use a method that maintains or controls for the relative frequencies of the amplified nucleic acids to achieve quantitative amplification.

[0065] Methods of “quantitative” amplification are well known to those of skill in the art. For example, quantitative PCR involves simultaneously co-amplifying a known quantity of a control sequence using the same primers. This provides an internal standard that may be used to calibrate the PCR reaction. The high density array may then include probes specific to the internal standard for quantification of the amplified nucleic acid.
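The use of an internal standard for calibration can be illustrated with a short Python sketch. It assumes, purely for illustration, that array signal is proportional to the amount of nucleic acid and that the control sequence and the target amplify with equal efficiency; the function name, units, and numbers are hypothetical.

```python
# Illustrative sketch of calibration against an internal standard.
# Assumptions (not from the source): signal is linear in quantity, and the
# control and target amplify with the same efficiency.
def estimate_target_quantity(target_signal: float,
                             standard_signal: float,
                             standard_quantity: float) -> float:
    """Scale the known quantity of the standard by the target/standard signal ratio."""
    if standard_signal <= 0:
        raise ValueError("standard_signal must be positive")
    return standard_quantity * target_signal / standard_signal


# Hypothetical numbers: 2.0 units of control spiked in before amplification.
estimate = estimate_target_quantity(target_signal=1500.0,
                                    standard_signal=500.0,
                                    standard_quantity=2.0)
print(estimate)
```

In practice the linearity assumption would need to be checked over the working range before relying on such an estimate.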

[0066] Other suitable amplification methods include, but are not limited to, polymerase chain reaction (PCR) (Innis, et al., PCR Protocols. A guide to Methods and Application. Academic Press, Inc. San Diego, (1990)), ligase chain reaction (LCR) (see Wu and Wallace, Genomics, 4: 560 (1989), Landegren, et al., Science, 241: 1077 (1988) and Barringer, et al., Gene, 89: 117 (1990)), transcription amplification (Kwoh, et al., Proc. Natl. Acad. Sci. USA, 86: 1173 (1989)), and self-sustained sequence replication (Guatelli, et al., Proc. Natl. Acad. Sci. USA, 87: 1874 (1990)).

[0067] Cell lysates or tissue homogenates often contain a number of inhibitors of polymerase activity. Therefore, RT-PCR typically incorporates preliminary steps to isolate total RNA or mRNA for subsequent use as an amplification template. A one-tube mRNA capture method may be used to prepare poly(A)+ RNA samples suitable for immediate RT-PCR in the same tube (Boehringer Mannheim). The captured mRNA can be directly subjected to RT-PCR by adding a reverse transcription mix and, subsequently, a PCR mix.

[0068] In a particularly preferred embodiment, the sample mRNA is reverse transcribed with a reverse transcriptase and a primer consisting of oligo dT, random hexamer, random nonamer or other primers to provide a single stranded DNA template. The reverse transcription reactions are preferably performed under conditions that suppress hairpin formation to reduce second strand cDNA synthesis. For example, actinomycin D may be added before the reverse transcription reaction is initiated (actinomycin D with mannitol (Sigma) was dissolved in water to a stock concentration of 1 mg/ml). One of skill in the art would appreciate that the scope of the invention is not limited to the particular concentration described herein. It is well within the skill of one of ordinary skill in the art to optimize assays by varying the concentration of reagents according to the particular experimental purpose and experimental conditions.

[0069] Before hybridization, the resulting cRNA or cDNA may be fragmented. One preferred method for fragmentation employs RNase-free RNA fragmentation buffer (200 mM tris-acetate, pH 8.1, 500 mM potassium acetate, 150 mM magnesium acetate). Approximately 20 μg of cRNA is mixed with 8 μL of the fragmentation buffer. RNase-free water is added to bring the volume to 40 μL. The mixture may be incubated at 94° C. for 35 minutes and chilled on ice.
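The final component concentrations implied by the fragmentation recipe above (8 μL of buffer in a 40 μL reaction) follow from simple dilution arithmetic, sketched here in Python. The helper name is hypothetical; the stock concentrations and volumes are those stated in the text.

```python
# Illustrative dilution arithmetic for the fragmentation reaction described above.
# Stock concentrations (mM) and volumes are taken from the text; the helper name
# is hypothetical.
STOCK_MM = {
    "tris-acetate": 200.0,
    "potassium acetate": 500.0,
    "magnesium acetate": 150.0,
}


def final_concentrations(stock_mm: dict, buffer_ul: float, total_ul: float) -> dict:
    """Return the concentration of each component after dilution to total_ul."""
    if total_ul <= 0 or buffer_ul < 0 or buffer_ul > total_ul:
        raise ValueError("volumes must satisfy 0 <= buffer_ul <= total_ul, total_ul > 0")
    return {name: conc * buffer_ul / total_ul for name, conc in stock_mm.items()}


final = final_concentrations(STOCK_MM, buffer_ul=8.0, total_ul=40.0)
for name, conc in final.items():
    print(f"{name}: {conc:g} mM")
```

With these inputs the buffer is diluted five-fold, giving 40 mM tris-acetate, 100 mM potassium acetate, and 30 mM magnesium acetate in the reaction.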

[0070] The biological sample should contain nucleic acids that reflect the level of at least some of the transcripts present in the cell, tissue or organ of the species of interest. In some embodiments, the biological sample may be prepared from cells, tissues or organs of a particular status. For example, a total RNA preparation may be made from the pituitary of a dog when the dog is pregnant. In another example, samples may be prepared from E. coli cells after the cells are treated with IPTG. Because certain genes may only be expressed under certain conditions, biological samples derived under various conditions may be needed to observe all transcripts. In some instances, the transcriptional annotation may be specific for a particular physiological, pharmacological or toxicological condition. For example, certain regions of a gene may only be transcribed under specific physiological conditions. Transcript annotation obtained using biological samples from the specific physiological conditions may not be applicable to other physiological conditions.

[0071] Nucleic acid hybridization simply involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing.

[0072] It is generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids. Under low stringency conditions (e.g., low temperature and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt) successful hybridization requires fewer mismatches.

[0073] One of skill in the art will appreciate that hybridization conditions may be selected to provide any degree of stringency. In a preferred embodiment, hybridization is performed at low stringency, in this case in 6×SSPE-T (0.005% Triton X-100) at 37° C., to ensure hybridization, and then subsequent washes are performed at higher stringency (e.g., 1×SSPE-T at 37° C.) to eliminate mismatched hybrid duplexes. Successive washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25×SSPE-T at 37° C. to 50° C.) until a desired level of hybridization specificity is obtained. Stringency can also be increased by addition of agents such as formamide. Hybridization specificity may be evaluated by comparison of hybridization to the test probes with hybridization to the various controls that can be present (e.g., expression level control, normalization control, mismatch controls, etc.).

[0074] In general, there is a tradeoff between hybridization specificity (stringency) and signal intensity. Thus, in a preferred embodiment, the wash is performed at the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity. Thus, in a preferred embodiment, the hybridized array may be washed at successively higher stringency solutions and read between each wash. Analysis of the data sets thus produced will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular oligonucleotide probes of interest.
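The selection rule in the preceding paragraph can be sketched in Python under simplifying assumptions. The sketch considers only the signal criterion (signal greater than approximately 10% of background) and does not model the "consistent results" criterion; the data structure, function name, and readings are hypothetical.

```python
# Illustrative sketch: choose the most stringent wash whose signal still exceeds
# a fraction of the background. Only the signal criterion from the text is
# modeled; consistency of results would have to be judged separately.
from typing import NamedTuple, Optional, Sequence


class WashRead(NamedTuple):
    label: str          # description of the wash condition (hypothetical labels)
    signal: float       # measured signal intensity after this wash
    background: float   # measured background intensity after this wash


def pick_wash(reads: Sequence[WashRead], min_fraction: float = 0.10) -> Optional[WashRead]:
    """reads must be ordered from lowest to highest stringency.

    Returns the last (most stringent) read that passes, or None if none pass.
    """
    chosen = None
    for read in reads:
        if read.signal > min_fraction * read.background:
            chosen = read
    return chosen


# Hypothetical readings taken between successive washes.
reads = [
    WashRead("1x SSPE-T, 37 C", signal=900.0, background=100.0),
    WashRead("0.5x SSPE-T, 37 C", signal=400.0, background=100.0),
    WashRead("0.25x SSPE-T, 50 C", signal=8.0, background=100.0),
]
best = pick_wash(reads)
print(best.label if best else "no wash met the signal criterion")
```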

[0075] Altering the thermal stability (Tm) of the duplex formed between the target and the probe using, e.g., known oligonucleotide analogues allows for optimization of duplex stability and mismatch discrimination. One useful aspect of altering the Tm arises from the fact that adenine-thymine (A-T) duplexes have a lower Tm than guanine-cytosine (G-C) duplexes, due in part to the fact that the A-T duplexes have 2 hydrogen bonds per base pair, while the G-C duplexes have 3 hydrogen bonds per base pair. In heterogeneous oligonucleotide arrays in which there is a non-uniform distribution of bases, it is not generally possible to optimize hybridization for each oligonucleotide probe simultaneously. Thus, in some embodiments, it is desirable to selectively destabilize G-C duplexes and/or to increase the stability of A-T duplexes. This can be accomplished, e.g., by substituting guanine residues in the probes of an array which form G-C duplexes with hypoxanthine, or by substituting adenine residues in probes which form A-T duplexes with 2,6 diaminopurine or by using the salt tetramethyl ammonium chloride (TMACl) in place of NaCl.
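The dependence of Tm on base composition can be illustrated with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) in °C. This rule is not part of the present description; it is a rough approximation generally quoted for short oligonucleotides and is used here only to show why probes with different G-C content cannot all be optimal under one hybridization condition. The function name and sequences are hypothetical.

```python
# Illustrative sketch using the Wallace rule of thumb (an outside approximation
# for short oligonucleotides, not a method from this description).
def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C as 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2 * at + 4 * gc


# Two hypothetical probes of equal length but different base composition.
at_rich = "ATATATATAT"
gc_rich = "GCGCGCGCGC"
print(wallace_tm(at_rich), wallace_tm(gc_rich))
```

The gap between the two estimates is the kind of spread that base analogues or TMACl are intended to narrow.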

[0076] Methods of optimizing hybridization conditions are well known to those of skill in the art (see, e.g., Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed. Elsevier, N.Y., (1993)).

[0077] Signal Detection and Data Analysis

[0078] In a preferred embodiment, the hybridized nucleic acids are detected by detecting one or more labels attached to the sample nucleic acids. The labels may be incorporated by any of a number of means well known to those of skill in the art. However, in a preferred embodiment, the label is simultaneously incorporated during the amplification step in the preparation of the sample nucleic acids. Thus, for example, polymerase chain reaction (PCR) with labeled primers or labeled nucleotides will provide a labeled amplification product. In a preferred embodiment, transcription amplification, as described above, using a labeled nucleotide (e.g., fluorescein-labeled UTP and/or CTP) incorporates a label into the transcribed nucleic acids. Alternatively, cDNAs are synthesized using an RNA sample as a template, and cRNAs are synthesized using the cDNAs as templates by in vitro transcription (IVT). A biotin label may be incorporated during the IVT reaction (Enzo Bioarray high yield labeling kit).

[0079] Alternatively, a label may be added directly to the original nucleic acid sample (e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after the amplification is completed. Means of attaching labels to nucleic acids are well known to those of skill in the art and include, for example, nick translation or end-labeling (e.g., with a labeled RNA) by kinasing of the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore).

[0080] Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents teaching the use of such labels include U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

[0081] Means of detecting such labels are well known to those of skill in the art. Thus, for example, radiolabels may be detected using photographic film or scintillation counters, and fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label. One particularly preferred method uses a colloidal gold label that can be detected by measuring scattered light.

[0082] The label may be added to the target (sample) nucleic acid(s) prior to, or after, the hybridization. So called “direct labels” are detectable labels that are directly attached to or incorporated into the target (sample) nucleic acid prior to hybridization. In contrast, so called “indirect labels” are joined to the hybrid duplex after hybridization. Often, the indirect label is attached to a binding moiety that has been attached to the target nucleic acid prior to the hybridization. Thus, for example, the target nucleic acid may be biotinylated before the hybridization. After hybridization, an avidin-conjugated fluorophore will bind the biotin bearing hybrid duplexes, providing a label that is easily detected. For a detailed review of methods of labeling nucleic acids and detecting labeled hybridized nucleic acids see Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed. Elsevier, N.Y., (1993).

[0083] Fluorescent labels are preferred and easily added during an in vitro transcription reaction. In a preferred embodiment, fluorescein labeled UTP and CTP are incorporated into the RNA produced in an in vitro transcription reaction as described above.

[0084] Means of detecting labeled target (sample) nucleic acids hybridized to the probes of the high density array are known to those of skill in the art. Thus, for example, where a colorimetric label is used, simple visualization of the label is sufficient. Where a radioactive labeled probe is used, detection of the radiation (e.g., with photographic film or a solid state detector) is sufficient.

[0085] In a preferred embodiment, however, the target nucleic acids are labeled with a fluorescent label and the localization of the label on the probe array is accomplished with fluorescence microscopy. The hybridized array is excited with a light source at the excitation wavelength of the particular fluorescent label and the resulting fluorescence at the emission wavelength is detected. In a particularly preferred embodiment, the excitation light source is a laser appropriate for the excitation of the fluorescent label.

[0086] The confocal microscope may be automated with a computer-controlled stage to automatically scan the entire high density array. Similarly, the microscope may be equipped with a phototransducer (e.g., a photomultiplier, a solid state array, a CCD camera, etc.) attached to an automated data acquisition system to automatically record the fluorescence signal produced by hybridization to each oligonucleotide probe on the array. Such automated systems are described at length in U.S. Pat. No. 5,143,854, PCT Application WO 92/10092, and U.S. application Ser. No. 08/195,889 filed on Feb. 10, 1994. Use of laser illumination in conjunction with automated confocal microscopy for signal detection permits detection at a resolution of better than about 100 μm, more preferably better than about 50 μm, and most preferably better than about 25 μm.

[0087] One of skill in the art will appreciate that methods for evaluating the hybridization results vary with the nature of the specific probe nucleic acids used as well as the controls provided. In the simplest embodiment, the fluorescence intensity for each probe is simply quantified. This is accomplished by measuring probe signal strength at each location (representing a different probe) on the high density array (e.g., where the label is a fluorescent label, detection of the amount of fluorescence (intensity) produced by a fixed excitation illumination at each location on the array). Comparison of the absolute intensities of an array hybridized to nucleic acids from a “test” sample with intensities produced by a “control” sample provides a measure of the relative expression of the nucleic acids that hybridize to each of the probes.

[0088] One of skill in the art, however, will appreciate that hybridization signals will vary in strength with efficiency of hybridization, the amount of label on the sample nucleic acid and the amount of the particular nucleic acid in the sample. Typically nucleic acids present at very low levels (e.g., <1 pM) will show a very weak signal. At some low level of concentration, the signal becomes virtually indistinguishable from the background. In evaluating the hybridization data, a threshold intensity value may be selected below which a signal is treated as essentially indistinguishable from the background and is not counted.
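A threshold of this kind can be applied as in the following minimal Python sketch. The probe names, intensities, and threshold value are hypothetical, and how the threshold itself is chosen is left open, as in the text.

```python
# Illustrative sketch: discard signals below a chosen threshold intensity.
# The probe names, intensities, and threshold are hypothetical values.
def filter_by_threshold(intensities: dict, threshold: float) -> dict:
    """Keep only probes whose intensity is at or above the threshold."""
    return {probe: value for probe, value in intensities.items() if value >= threshold}


intensities = {"probe_1": 1520.0, "probe_2": 35.0, "probe_3": 410.0}
counted = filter_by_threshold(intensities, threshold=100.0)
print(sorted(counted))
```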

[0089] Suitable scanners, computer software for controlling the scanners and computer software for data management and analysis are available from commercial sources, such as Affymetrix, Inc., Santa Clara, Calif.

EXAMPLE

[0090] This example illustrates one embodiment of the invention.

[0091] Materials and Methods

Bacterial growth conditions. A single colony of E. coli K-12 (MG1655) was inoculated in 5 ml of Luria-Bertani (LB) broth and grown overnight with constant aeration at 37° C. The next day 20 ml of LB broth was inoculated with 0.2 ml of the overnight culture and grown at 37° C. with constant aeration to an optical density (OD₆₀₀) of 0.8. The cells were incubated for 30 min before RNA isolation.

RNA isolation. Total RNA was isolated from the cells using the protocol accompanying the MasterPure complete DNA/RNA purification kit from Epicentre Technologies (Madison, Wis.). Isolated RNA was resuspended in diethylpyrocarbonate (DEPC)-treated water, quantitated based on absorption at 260 nm and stored in aliquots at −20° C. until further use.

[0092] mRNA enrichment and labeling. Enrichment of mRNA was done as described in the Affymetrix Expression Technical Manual (Affymetrix Inc., Santa Clara, Calif.). In brief, a set of oligonucleotide primers specific for either 16S or 23S rRNA are mixed with total RNA isolated from bacterial cultures. After annealing at 70° C. for 5 min, 300 U MMLV reverse transcriptase (Epicentre Technologies, Madison, Wis.) is added to synthesize cDNA strands complementary to the two rRNA species. The cDNA strand synthesis allows for selective degradation of the 16S and 23S rRNAs by RNase H. Treatment of the RNA/cDNA mixture with DNase I (Amersham Pharmacia Biotech, Piscataway, N.J.) removes the cDNA molecules and oligonucleotide primers, which results in an RNA preparation that is enriched for mRNA by 80% (data not shown). For direct labeling of RNA, 20 μg enriched bacterial RNA was fragmented at 95° C. for 30 min in a total volume of 88 μl of 1×NEB buffer for T4 polynucleotide kinase (New England Biolabs, Beverly, Mass.). After cooling to 4° C., 50 μM γ-S-ATP (Roche Molecular Biochemicals, Indianapolis, Ind.) and 100 U T4 polynucleotide kinase (Roche Molecular Biochemicals) were added to the fragmented RNA and the reaction was incubated at 37° C. for 50 min. To inactivate T4 polynucleotide kinase, the reaction was incubated for 10 min at 65° C. and the RNA was subsequently ethanol precipitated to remove excess γ-S-ATP. After centrifugation the RNA pellet was resuspended in 96 μl of 30 mM MOPS, pH 7.5, and 4 μl of a 50 mM PEO-iodoacetylbiotin (Pierce Chemical, Rockford, Ill.) solution was added to introduce the biotin label. The reaction was incubated at 37° C. for 1 h and the labeled RNA was purified using the RNA/DNA Mini-Kit from Qiagen (Valencia, Calif.) as recommended by the manufacturer. Eluted RNA was quantitated based on the absorption at 260 nm and hybridized to the oligonucleotide array.

cDNA synthesis and labeling. For the cDNA synthesis method, 10 μg total RNA was reverse transcribed using the SuperScript II system for first strand cDNA synthesis from Life Technologies (Rockville, Md.). For the reaction, 500 ng random hexamers were mixed with the RNA in a total volume of 12 μl and heated to 70° C. for 10 min. After cooling to 25° C. within 10 min, the reaction buffer was added according to the manufacturer's recommendations. After increasing the temperature to 42° C. within 10 min, 1800 U SuperScript II was added to the reaction and incubated for 50 min. SuperScript II was heat inactivated at 72° C. for 15 min and the mixture cooled to 4° C. RNA was removed using 2 U RNase H (Life Technologies) and 1 μg RNase A (Epicentre, Madison, Wis.) for 10 min at 37° C. in 100 μl total volume. The cDNA was purified using the QiaQuick PCR purification kit from Qiagen (Valencia, Calif.). Isolated cDNA was quantitated based on the absorption at 260 nm and fragmented using a partial DNase I digest. For up to 5 μg isolated cDNA, 0.2 U DNase I (Roche Molecular Biochemicals) was added and incubated for 10 min at 37° C. in 1× One-Phor-All buffer (Amersham Pharmacia Biotech) and the reaction stopped by incubation at 99° C. for 10 min. The fragmentation was confirmed on a 0.7% agarose gel to verify that the fragments had an average length of 50-100 bp. The fragmented cDNA was 3′-end-labeled for 2 h at 37° C. using 175 U terminal transferase (Roche Molecular Biochemicals) and 70 μM biotin-N6-ddATP (DuPont/NEN, Boston, Mass.) in 1×TdT buffer (0.2 M potassium cacodylate, 25 mM Tris-HCl, 0.25 mg/ml BSA, pH 6.6; Roche Molecular Biochemicals) and 2.5 mM cobalt chloride. The fragmented and end-labeled cDNA was added to the hybridization solution without further purification. In some experiments, actinomycin D with mannitol (Sigma) was dissolved in water to a stock concentration of 1 mg/ml. The absorbance at 440 nm was used to determine the concentration, and actinomycin D was added to the reverse transcription reaction at a final concentration of 50 μg/ml before addition of the SuperScript II.

[0093] Oligonucleotide Probe Array. On the oligonucleotide arrays a given gene or intergenic (Ig) region is represented by 15 different 25-mer oligonucleotides that are designed to be complementary to the target sequence and serve as unique, sequence-specific detectors (termed perfect match probes). An additional control element on these arrays is the use of mismatch (MM) control probes that are designed to be identical to their perfect match (PM) partners except for a single base difference in the central position. The presence of the MM oligonucleotide allows cross-hybridization and local background to be estimated and subtracted from the PM signal. For a given transcript the numbers of positive and negative probe pairs, as well as the PM and MM intensities, are used to determine whether a transcript is present (P), marginal (M) or absent (A). A probe pair is called positive when the intensity of the PM probe cell is significantly greater than that of the corresponding MM probe cell; a probe pair is called negative if the situation is reversed. The average difference (Avg Diff) of all 15 probes in a probe set is used to determine the level of expression of a transcript and is calculated by taking the difference between the PM and MM of every probe and averaging the differences over the entire probe set, with some trimming of outlier values.

Array hybridization and scanning. The hybridization solution contained 100 mM MES, 1 M NaCl, 20 mM EDTA and 0.01% Tween 20, pH 6.6 (referred to as 1×MES). In addition, the solution contained 0.1 mg/ml herring sperm DNA, 0.5 mg/ml BSA and 0.5 nM control Biotin-oligo 948. Samples were heated to 99° C. for 5 min, followed by 45° C. for an additional 5 min before being placed in the array cartridge. Hybridization was carried out at 45° C. for 16 h with mixing on a rotary mixer at 60 r.p.m. Following hybridization, the sample solution was removed and the array was washed and stained as recommended in the technical manual (Affymetrix Inc.).
In brief, to enhance the signals 10 μg/ml streptavidin and 2 mg/ml BSA in 1×MES was used as the first staining solution. After the streptavidin solution was removed, an antibody mix was added as the second stain, containing 0.1 mg/ml goat IgG, 5 μg/ml biotin-bound anti-streptavidin antibody and 2 mg/ml BSA in 1×MES. Nucleic acid was fluorescently labeled by incubation with 10 μg/ml streptavidin-phycoerythrin (Molecular Probes, Eugene, Oreg.) and 2 mg/ml BSA in 1×MES. The arrays were read at 570 nm with a resolution of 3 μm using a confocal laser scanner (Affymetrix Inc.).

Results. The addition of actinomycin D to the cDNA reaction did not significantly affect first-strand synthesis but caused the number of present calls on the sense array to decrease significantly, by 64%, indicating second-strand inhibition (see Table 1). The remaining genes that were still present on the sense array were then studied. It was found that 67% of these genes were also present on the antisense chip, indicating an alternative mechanism for second strand cDNA synthesis or antisense transcripts. The other 32% were not present on the antisense arrays and are thought to be candidates for antisense RNAs. Our results allow reverse transcription to be studied on a global level, not only elucidating that the hairpin structure is the primary source of priming for second-strand cDNA, but also allowing the identification of potential antisense transcripts.

TABLE 1
Effect of Actinomycin D on Detection of Transcripts

Array       Experiments       No. of Calls    Mean average difference of present calls
Antisense   Actinomycin D     2574            2300
Antisense   No Actinomycin    2396            2320
Sense       Actinomycin D     950             1100
Sense       No Actinomycin    432             1000
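The average difference (Avg Diff) calculation described under Oligonucleotide Probe Array can be sketched in Python as follows. This is a simplified illustration, not the scanner software's algorithm: the text does not specify how outliers are trimmed or what makes a PM signal "significantly greater" than MM, so the `trim` parameter (values dropped from each end of the sorted differences) and the `min_diff` cutoff are assumptions, and the five-pair probe set is a hypothetical stand-in for the 15 probe pairs on the array.

```python
# Illustrative sketch of the Avg Diff calculation. The trimming rule (`trim`)
# and the positive/negative cutoff (`min_diff`) are assumptions; the source
# does not specify either.
from statistics import mean
from typing import Sequence, Tuple


def average_difference(pm: Sequence[float], mm: Sequence[float], trim: int = 0) -> float:
    """Average PM - MM over a probe set, optionally dropping `trim` extreme
    differences from each end before averaging."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    diffs = sorted(p - m for p, m in zip(pm, mm))
    if trim:
        if 2 * trim >= len(diffs):
            raise ValueError("trim removes every probe pair")
        diffs = diffs[trim:-trim]
    return mean(diffs)


def count_probe_pairs(pm: Sequence[float], mm: Sequence[float],
                      min_diff: float = 0.0) -> Tuple[int, int]:
    """Count positive (PM exceeds MM by more than min_diff) and negative
    (MM exceeds PM by more than min_diff) probe pairs."""
    positive = sum(1 for p, m in zip(pm, mm) if p - m > min_diff)
    negative = sum(1 for p, m in zip(pm, mm) if m - p > min_diff)
    return positive, negative


# Hypothetical intensities for a five-pair probe set.
pm = [1200, 1100, 900, 5000, 1000]
mm = [200, 300, 400, 100, 1100]
print(average_difference(pm, mm))
print(average_difference(pm, mm, trim=1))
print(count_probe_pairs(pm, mm))
```

Comparing the untrimmed and trimmed averages shows how a single unusually bright probe pair can dominate the probe-set value when no trimming is applied.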

CONCLUSION

[0094] The present inventions provide methods for analyzing a large number of RNAs. It is to be understood that the above description is intended to be illustrative and not restrictive. Many variations of the invention will be apparent to those of skill in the art upon reviewing the above description. By way of example, the invention has been described primarily with reference to the use of a high density oligonucleotide array, but it will be readily recognized by those of skill in the art that other nucleic acid arrays are also within the scope of the invention. The scope of the invention should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. All cited references, including patent and non-patent literature, are incorporated herewith by reference in their entireties for all purposes.

1. A method for detecting a plurality of transcripts comprising: synthesizing a plurality of cDNAs complementary with the transcripts by reverse transcription; wherein the synthesis of second strand cDNA is inhibited; and hybridizing the cDNAs or nucleic acids derived from the cDNAs with a nucleic acid probe array to detect the transcripts.
2. The method of claim 1 wherein the synthesis of the second strand cDNA is inhibited by the presence of actinomycin.
3. The method of claim 2 wherein the cDNAs or nucleic acids derived from the cDNAs are labeled.
4. The method of claim 2 wherein the nucleic acid probe array is an oligonucleotide probe array.
5. The method of claim 4 wherein the nucleic acid probe array has at least 400 probes per cm².
6. The method of claim 5 wherein the nucleic acid probe array has at least 1000 probes per cm².
7. The method of claim 6 wherein the nucleic acid probe array has at least 10000 probes per cm².
8. The method of claim 4 wherein the nucleic acid probe array comprises at least one probe against a target sequence and one probe against the reverse complementary sequence of the target sequence.
9. The method of claim 8 wherein the nucleic acid probe array comprises at least 100 probes against at least 100 target sequences and at least 100 probes against at least 100 reverse complementary sequences of the target sequences.
10. The method of claim 9 wherein the nucleic acid probe array comprises at least 1000 probes against at least 1000 target sequences and at least 1000 probes against at least 1000 reverse complementary sequences of the target sequences.
11. The method of claim 10 wherein the nucleic acid probe array comprises at least 3000 probes against at least 3000 target sequences and at least 3000 probes against at least 3000 reverse complementary sequences of the target sequences.
12. A method for detecting transcribed regions of a genome comprising obtaining a sample comprising transcripts transcribed from the genome; synthesizing single stranded cDNAs complementary with the transcripts, wherein the synthesis of second strand cDNA is inhibited; and hybridizing the cDNAs or nucleic acids derived from the cDNAs with a nucleic acid probe array, wherein the nucleic acid probe array has probes targeting both strands of the genomic DNA in interested regions.
13. The method of claim 12 wherein the synthesis of the second strand cDNA is inhibited by the presence of actinomycin.
14. The method of claim 13 wherein the cDNAs or nucleic acids derived from the cDNAs are labeled.
15. The method of claim 14 wherein the nucleic acid probe array is an oligonucleotide probe array.
16. The method of claim 15 wherein the nucleic acid probe array has at least 400 probes per cm².
17. The method of claim 16 wherein the nucleic acid probe array has at least 1000 probes per cm².
18. The method of claim 17 wherein the nucleic acid probe array has at least 10000 probes per cm².
19. The method of claim 12 further comprising determining the template strand for at least one transcript, and wherein the probe array contains probes against both strands of the genomic DNA region where the transcript is transcribed.
20. An assay kit comprising: reagents necessary for a reverse transcription reaction; an inhibitor of second strand cDNA synthesis and a nucleic acid probe array.
21. The kit of claim 20 wherein the inhibitor is actinomycin D.
22. The kit of claim 21 wherein the nucleic acid probe array is an oligonucleotide probe array.
23. The kit of claim 22 wherein the nucleic acid probe array has at least 400 probes per cm².
24. The kit of claim 23 wherein the nucleic acid probe array has at least 1000 probes per cm².
25. The kit of claim 24 wherein the nucleic acid probe array has at least 10000 probes per cm².
26. The kit of claim 25 wherein the nucleic acid probe array comprises at least one probe against a target sequence and one probe against the reverse complementary sequence of the target sequence.
27. The kit of claim 26 wherein the nucleic acid probe array comprises at least 100 probes against at least 100 target sequences and at least 100 probes against at least 100 reverse complementary sequences of the target sequences.
28. The kit of claim 27 wherein the nucleic acid probe array comprises at least 1000 probes against at least 1000 target sequences and at least 1000 probes against at least 1000 reverse complementary sequences of the target sequences.
29. The kit of claim 28 wherein the nucleic acid probe array comprises at least 3000 probes against at least 3000 target sequences and at least 3000 probes against at least 3000 reverse complementary sequences of the target sequences.